******Replication Instructions for "Civic Education in High School and Voter Turnout in Adulthood."

*This study uses several different files from Add Health. These are restricted-use files and must be obtained directly from Add Health. They are:

*Wave III and Wave IV Add Health surveys and W1 Parent Data:
*https://www.cpc.unc.edu/projects/addhealth/documentation/restricteduse/datasets/index.html#CoreFiles
*The datasets should be merged (using respondent ID number) to create one file

*You also need the Wave III Academic Transcript Social Studies and Civic Coursework (ATRCVC) Data:
*https://www.cpc.unc.edu/projects/addhealth/documentation/restricteduse/datasets

*Finally, you need the sibling identification file for the fixed-effects analyses: 
*https://www.cpc.unc.edu/projects/addhealth/documentation/restricteduse/datasets/index.html#SiblingFiles

*Information on the process that researchers must follow in order to obtain datasets can be found here: 
*https://www.cpc.unc.edu/projects/addhealth/contracts

***********FIRST STEP: SET UP THE CIVICS/SOCIAL STUDIES COURSE DATA********************
*This code is run on the Academic Transcript dataset 

*Generate a within-respondent row counter (1 row = 1 course).
*For example, a respondent with 10 rows gets n1 = 1 on the first row, 2 on the second, ... and 10 on the tenth.
*Because each person appears in the dataset once per course, the regressions below use only each person's first row (n1==1). The egen commands below copy the course counts onto every row, so the first row carries everything that is needed.
sort aid
by aid: generate n1 = _n

*turn ATRCVC01 (SS Course Content Category variable) into a count of the number of each course type for each respondent
*totalsscoures (sum of all types; note the spelling of the variable name used throughout) should equal the number of rows for each person
tab atrcvc01, gen(coursecategory)
egen totalexplearningcourses = total(coursecategory1), by(aid) 
egen totalservicelearningcourses = total(coursecategory2), by(aid) 
egen totalcivicskillscourses = total(coursecategory3), by(aid) 
egen totalissuescourses = total(coursecategory4), by(aid) 
egen totalmarginalizedgroupscourses = total(coursecategory5), by(aid) 
egen totalamericanhistorycourses = total(coursecategory6), by(aid) 
egen totalmulticulturalcourses = total(coursecategory7), by(aid) 
egen totalpolknowledgecourses = total(coursecategory8), by(aid) 
egen totalothersscourses = total(coursecategory9), by(aid) 
gen  totalsscoures = totalexplearningcourses+ totalservicelearningcourses+ totalcivicskillscourses+ totalissuescourses+ totalmarginalizedgroupscourses+ totalamericanhistorycourses+ totalmulticulturalcourses+ totalpolknowledgecourses+ totalothersscourses
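
*Optional check (an addition to the original code, offered as a sketch): the comment above says the summed count should match each respondent's number of course rows.
*nonmissing_ss_rows is a hypothetical helper variable. The check assumes rows with a missing atrcvc01 should not be counted, since tab, gen() leaves the dummies missing on those rows.
egen nonmissing_ss_rows = total(!missing(atrcvc01)), by(aid)
*a count of 0 means the two measures agree on every row
count if totalsscoures != nonmissing_ss_rows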

*Save this file so that it can be merged into the dataset with Wave 3, Wave 4, and Parent data
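*A minimal sketch of that save step, which the original leaves implicit. The file name is a placeholder, not the authors' path.
save "transcript_course_counts.dta", replace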


***********SECOND STEP: RECODE ADD HEALTH SURVEY DATA*************************
*In the dataset that contains the Wave 3, Wave 4, and Wave 1 Parent survey data, you will need to generate and recode the dependent variables and controls 

*gen male dummy and recode so that 1=male 
gen male =  BIO_SEX3 
recode male 2=0

*generate mom's education and get rid of missing data from mom's education measure 
gen  momeducation = PA12
recode  momeducation 96=. 
recode  momeducation 10=0 

*parent civic org participation 
gen parent_civicorg = PA27E
recode parent_civicorg 6=.

*parent income 
gen parentalincome = PA55 
recode parentalincome  9996=. 

*cognitive ability 
gen cog_ability = AH_PVT

*respondent race variables 
gen white = H3OD4A
recode white 0=0 1=1 6/9=.
tab white 

gen black = H3OD4B
tab black
recode black 0=0 1=1 6/9=.

gen hispanic =H3OD2
recode hispanic 0=0 1=1 6/9=.
tab hispanic 

gen asian = H3OD4D
tab asian
recode asian 6/9=.
tab asian

tab H3OD4C
gen nativeam = H3OD4C
recode nativeam 6/9=.

*religious attendance 
gen religosity = H3RE24
recode  religosity 96/99=. 

*age 
gen calcage3 = CALCAGE3 

*wave 3 highest educational attainment 
gen w3_education = H3ED1
recode w3_education 96/99=. 

*wave 4 highest educational attainment 
gen w4_education = H4ED2 
recode w4_education 96=. 
*convert scale to years of education 
recode w4_education 1=8 2=10 3=12 4=13 5=14 6=14 7=16 8=17 9=18 10=19 11=20 12=18 13=19 

*turnout wave 3
gen w3_turnout = VOTE_2000 
*gen w3_turnout = H3CC8
*recode w3_turnout 6/9=. 1=1 0=0

*turnout wave 4
gen w4_turnout = VOTE_GEN
*gen w4_turnout = H4DA27
*recode w4_turnout 96/98=. 


******THIRD STEP: Merge transcript dataset and survey dataset and keep the respondents who were successfully merged*********
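*open the transcript dataset (one row per respondent/course), then merge in the survey file
*m:1 is used because the transcript data have many rows per aid, while the survey file is assumed to have exactly one row per aid
merge m:1 aid using "/Users/Aaron/Desktop/Replication Files BJPS/full ah dataset.dta"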
keep if _merge==3
save "/Users/Aaron/Desktop/Replication Files BJPS/full ah dataset.dta", replace



**********ANALYSES FOR TABLE 1 REGRESSION MODELS***********
*Note that because respondents are in the file multiple times (each row is a respondent/course), you restrict the models to the first row for each person (n1==1) and also make sure that there is no missing data on the courses measure (atrcvc01!=.) 

reg w3_turnout totalsscoures if n1==1 & atrcvc01!=.
est store m1a
reg w3_turnout totalsscoures male calcage3 w3_education  momeducation black hispanic asian nativeam parent_civicorg parentalincome religosity cog_ability if n1==1 & atrcvc01!=. 
predict m2a
est store m2a
reg w3_turnout totalsscoures if n1==1 & atrcvc01!=. & m2a!=. 
est store m1a
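
*Optional check (an addition to the original code): the bivariate and full models should now be estimated on the same sample, so the N reported for m1a and m2a should match
estimates table m1a m2a, stats(N)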


reg w4_turnout totalsscoures if n1==1 & atrcvc01!=.
est store m3a 
reg w4_turnout totalsscoures male calcage3 w4_education  momeducation black hispanic asian nativeam parent_civicorg parentalincome religosity  cog_ability if n1==1 & atrcvc01!=.
predict m4a
est store m4a 
reg w4_turnout totalsscoures if n1==1 & atrcvc01!=. & m4a!=. 
est store m3a


reg w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses  if n1==1  & atrcvc01!=.
est store m5a 
reg w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 w3_education  momeducation  black hispanic asian nativeam parent_civicorg parentalincome religosity cog_ability if n1==1 &  atrcvc01!=.
predict m6a
est store m6a 
reg w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses  if n1==1  & atrcvc01!=. & m6a!=. 
est store m5a 


reg w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses if n1==1 &  atrcvc01!=.
est store m7a 
reg w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 w4_education  momeducation  black hispanic asian nativeam parent_civicorg parentalincome religosity cog_ability if n1==1 &  atrcvc01!=.
predict m8a 
est store m8a 
reg w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses if n1==1 &  atrcvc01!=. & m8a!=. 
est store m7a 

*this code writes the models to a formatted regression table (Table 1 in the paper) for the LaTeX typesetting system
esttab m1a m2a m3a m4a m5a m6a m7a m8a using Table1revisedOLSmodelswithsameNs.tex, style(tex) cells(b(star fmt(%9.3f)) se(par)) 


********SAME MODELS AS ABOVE BUT USING LOGIT (WAVE 3) AND ORDERED LOGIT (WAVE 4), a robustness check included in the Online Appendix************

logit w3_turnout totalsscoures if n1==1 & atrcvc01!=. & m2a!=.
est store m1c
logit w3_turnout totalsscoures male calcage3 w3_education momeducation black hispanic asian nativeam parent_civicorg parentalincome religosity cog_ability if n1==1 & atrcvc01!=. 
est store m2c


ologit w4_turnout totalsscoures if n1==1 & atrcvc01!=. & m4a!=. 
est store m3c
ologit w4_turnout totalsscoures male calcage3 w4_education momeducation black hispanic asian nativeam parent_civicorg parentalincome religosity cog_ability if n1==1 & atrcvc01!=.
est store m4c 

logit w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses  if n1==1  & atrcvc01!=. & m6a!=. 
est store m5c
logit w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 w3_education momeducation  black hispanic asian nativeam parent_civicorg parentalincome religosity cog_ability if n1==1 &  atrcvc01!=.
est store m6c  

ologit w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses if n1==1 &  atrcvc01!=. & m8a!=. 
est store m7c
ologit w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 w4_education momeducation  black hispanic asian nativeam parent_civicorg parentalincome religosity cog_ability if n1==1 &  atrcvc01!=.
est store m8c

*Generate Table 1 in the Online Appendix in LaTeX format
esttab m1c m2c m3c m4c m5c m6c m7c m8c using Table1revisedappendixLogitOrderedsameNs.tex, style(tex) cells(b(star fmt(%9.3f)) se(par)) 

 
******FOURTH STEP: Create a copy of the merged dataset that is being used above so that just the sibling subsample can be analyzed********

*With that dataset open
 
keep if n1==1
*this just gets rid of the repeated lines for each person (since everyone has a line for each course they had). Since I already created the course variables, we only need to keep each person's first line (the course measures are on each line for each person)

drop _merge
*gets rid of the merge identifier that was previously created (when merging in the above section)

*Then, merge in the sibling file into the dataset 

merge m:m aid using "/Users/Aaron/Desktop/sibs.dta"
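
*Optional check (an addition to the original code): tabulate the match results before dropping unmatched respondents
tab _merge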

*keep those respondents who were merged in 
keep if _merge==3

*get rid of people who are the only family member in the file (since we need at least 2 siblings for FE analysis)
egen numberperfamily = count(aid), by(famid)
tab numberperfamily 
drop if numberperfamily==1 

*make sure there is no missing data on DV (since we need complete family data--no one in a family can be missing data)
drop if w3_turnout ==.

*Then, make sure to drop any family that is incomplete due to someone missing data on the DV (turnout)
egen numberperfamilyX = count(aid), by(famid)
tab numberperfamilyX 
drop if numberperfamilyX==1
tab numberperfamilyX 

*Now, generate a variable that gives a number to each person in a family unit 
bys famid: gen personnumber=_n

*setup data for fixed-effects analysis 
xtset famid personnumber
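
*Optional check (an addition to the original code): confirm that every remaining family has at least two members, as the fixed-effects models require
*famsize_check is a hypothetical helper variable
bys famid: gen famsize_check = _N
assert famsize_check >= 2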

************************ANALYSES for Tables 2 & 3 in Online Appendix, Figure 1 in Paper, and Figure 1 in Online Appendix *************************
*USING W3 Vote as DV 
*effect of total number of ss classes 
reg w3_turnout totalsscoures male calcage3 if atrcvc01!=., cluster(famid)
est store m1
*effect after adding FEs 
xtreg w3_turnout totalsscoures male calcage3 if atrcvc01!=.  , fe cluster(famid)
est store m2

*effect of each type of class/experience (variables are counts of how many classes in each category)
reg w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 if atrcvc01!=., cluster(famid)
est store m3
*effect of each type of class/experience adding FEs
xtreg w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3  if atrcvc01!=. , fe  cluster(famid)
est store m4 

*To create the first panel in Figure 1 in the Appendix run the following code: 
coefplot m1 m2 m3 m4, xline(0) drop(_cons) levels(95 90) sort keep(totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses totalsscoures)

*To create the left plot in Figure 1 in the paper, use this code: 
coefplot m1 m2 m3 m4, xline(0) drop(_cons) levels(95 90) sort keep(totalexplearningcourses totalsscoures totalpolknowledgecourses  totalmulticulturalcourses totalmarginalizedgroupscourses)

*To create a LaTeX table of results (Table 2 in Appendix)
esttab m1 m2 m3 m4 using Table.tex, style(tex) cells(b(star fmt(%9.3f)) se(par)) 


******Create a copy of the merged dataset then conduct the W4 FE models which have a slightly lower N

*Get rid of people with missing data on the Wave 4 turnout measure 
drop if w4_turnout==.

*Then, count the number of people per family (now that missing data is gone) and drop any family left with just 1 member (since we need complete family units)
egen numberperfamilyY = count(aid), by(famid)
tab numberperfamilyY 
drop if numberperfamilyY==1

*Now, generate a variable that gives a number to each person in a family unit 
bys famid: gen personnumberX=_n

*Run Models 

xtset famid personnumberX
*effect of total number of ss classes 
reg w4_turnout totalsscoures male calcage3 if atrcvc01!=., cluster(famid)
est store m5 
*effect after adding FEs 
xtreg w4_turnout totalsscoures male calcage3 if atrcvc01!=. , fe cluster(famid)
est store m6

*effect of each type of class/experience (variables are counts of how many classes in each category)
reg w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 if atrcvc01!=., cluster(famid)
est store m7
*effect of each type of class/experience adding FEs
xtreg w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 if atrcvc01!=. , fe cluster(famid)
est store m8 

*To create second plot in Figure 1 in the Appendix run the following code: 
coefplot m5 m6 m7 m8, xline(0) drop(_cons) levels(95 90) sort keep(totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses totalsscoures)

*To create the right plot in Figure 1 in the paper, use this code: 
coefplot m5 m6 m7 m8, xline(0) drop(_cons) levels(95 90) sort keep(totalexplearningcourses totalsscoures totalmulticulturalcourses)

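*To create a LaTeX table of results (Table 3 in the Appendix)
*the output goes to its own file so that it does not collide with the Table 2 output (Table.tex); the file name Table3.tex is a placeholder
esttab m5 m6 m7 m8 using Table3.tex, style(tex) cells(b(star fmt(%9.3f)) se(par)) 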


******Adding birth order as a control variable (robustness check included in the Appendix)*********
*create copies of each dataset from above (since N will decrease slightly due to some missing data on birth order)

*For Wave 3 models  
*recode birthyear to get rid of DK/refuse responses 
tab H1HR15
recode H1HR15 97=. 98=. 

*get rid of missing data on birth year variable  
tab w3_turnout , m
tab  H1HR15, m
drop if  H1HR15==.

*get rid of incomplete families (since need complete data on each family) 
egen numberperfamilytest2 = count(aid), by(famid)
tab numberperfamilytest2 
drop if numberperfamilytest2==1

*generate number identifying which person each respondent is within each family
bys famid: gen personnumberX=_n

*models 
xtset famid personnumberX
reg w3_turnout totalsscoures male calcage3 H1HR15 if atrcvc01!=., cluster(famid)
est store m1
xtreg w3_turnout totalsscoures male calcage3  H1HR15 if atrcvc01!=.  , fe cluster(famid)
est store m2
reg w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 H1HR15 if atrcvc01!=.,  cluster(famid)
est store m3
xtreg w3_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3  H1HR15 if atrcvc01!=. , fe cluster(famid)
est store m4

*generate Table 4 in the Appendix in LaTeX
esttab m1 m2 m3 m4 using w3birthorderClusteredSEs.tex, style(tex) cells(b(star fmt(%9.3f)) se(par)) 


*For wave 4 models 

*recode birthyear variable 
tab H1HR15
recode H1HR15 97=. 98=. 

*get rid of missing data 
tab  H1HR15, m
drop if  H1HR15==.

*get rid of families that are incomplete due to birth year missingness 
egen numberperfamilytest2 = count(aid), by(famid)
tab numberperfamilytest2 
drop if numberperfamilytest2==1

*models 
reg w4_turnout totalsscoures male calcage3 H1HR15 if atrcvc01!=., cluster(famid)
est store m1 
xtreg w4_turnout totalsscoures male calcage3 H1HR15 if atrcvc01!=. , fe cluster(famid) 
est store m2
reg w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 H1HR15 if atrcvc01!=., cluster(famid)
est store m3
xtreg w4_turnout totalexplearningcourses totalservicelearningcourses totalcivicskillscourses totalissuescourses totalmarginalizedgroupscourses totalamericanhistorycourses totalmulticulturalcourses totalpolknowledgecourses totalothersscourses male calcage3 H1HR15 if atrcvc01!=. , fe  cluster(famid)
est store m4  

*generate Table 5 in the Appendix in LaTeX
esttab m1 m2 m3 m4 using w4birthorderClusteredSEsv2.tex, style(tex) cells(b(star fmt(%9.3f)) se(par) ) 







 
 




